Skip to content

Added tests for enumpoly, LP and LCP#705

Merged
tturocy merged 18 commits intomasterfrom
nash_tests
Dec 19, 2025
Merged

Added tests for enumpoly, LP and LCP#705
tturocy merged 18 commits intomasterfrom
nash_tests

Conversation

@StephenPasteris
Copy link
Copy Markdown
Collaborator

Added tests for enumpoly, LP and LCP

@rahulsavani rahulsavani self-requested a review December 17, 2025 08:12
Copy link
Copy Markdown
Member

@rahulsavani rahulsavani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new games have been added without being consistent with what is already there:

  1. a version of stripped down poker is created in a new function in tests/games.py but that new code still refers to Myerson not Reiley; moreover it ignores the existing create_stripped_down_poker_efg. For reconciling create_one_card_poker_lacking_outcome_efg() with create_stripped_down_poker_efg it seems that you might want to add a third variant. In that case you might want to go from a boolean flag in to some kind of "type" argument, which if None would give the standard one.

  2. the new games added with non-terminal or missing outcomes etc. do not have variants with all terminal outcomes; note that the existing functions from Master for the relevant tests with and without non-terminal outcomes (create_stripped_down_poker_efg, create_matching_pennies_efg, and create_kuhn_poker_efg) all have an argument that allows to create a standard version with only (and all) terminal outcomes and a non-standard one. The tests then include a case for the standard version, and a xfail variant created via the same function in game with an appropriate argument.

Here's an example:

        (
                games.create_2x2_zero_sum_efg(),
                [[["1/2", "1/2"]], [["1/2", "1/2"]]]
        ),
        pytest.param(
            games.create_2x2_zero_sum_efg(missing_term_outcome=True),
            [[["1/2", "1/2"]], [["1/2", "1/2"]]],
            marks=pytest.mark.xfail(reason="Problem with missing terminal outcomes")
        ),

Here's another example:

        (games.create_matching_pennies_efg(),
         [[["1/2", "1/2"]], [["1/2", "1/2"]]]),
        pytest.param(
            games.create_matching_pennies_efg(with_neutral_outcome=True),
            [[["1/2", "1/2"]], [["1/2", "1/2"]]],
            marks=pytest.mark.xfail(reason="Problem with missing outcomes")
        ),

There's also examples for LP and LCP solve in mixed behaviors for stipped-down poker and Kuhn poker.

In these examples the two versions, with and without non-terminal outcomes, may or may not give the same answer (note e.g. that currently kunh_poker_lcp_first_mixed_strategy_prof() is used in the xfailing and standard tests. I believe you have an example (maybe one of these games even) where making such a variation changed the output of the LP or LCP solver, which is not impossible (namely that the solver finds a different solution even though the mathematical program getting solved is equivelent in terms of the set of solutions).

I presume you put the "expected outputs" for the xfailing variants based on a standard variant anyhow? Or did you already have working code for these xfailing variants? Either way, please add tests for the standard variants as well.

Note that we also include for stripped-down poker and Kuhn the terminal-only and non-terminal examples for solving in mixed strategies, here from test_lcp_strategy_rational:

        (games.create_stripped_down_poker_efg(), [[["1/3", "2/3", 0, 0], ["2/3", "1/3"]]], None),
        (
                games.create_stripped_down_poker_efg(nonterm_outcomes=True),
                [[["1/3", "2/3", 0, 0], ["2/3", "1/3"]]],
                None
        ),
        (games.create_kuhn_poker_efg(), [games.kuhn_poker_lcp_first_mixed_strategy_prof()], 1),
        (
                games.create_kuhn_poker_efg(nonterm_outcomes=True),
                [games.kuhn_poker_lcp_first_mixed_strategy_prof()],
                1
        ),

pytest.param(
games.create_perfect_info_internal_outcomes_efg(),
[[[0, 1], [1, 0]], [[0, 1], ["1/2", "1/2"]]],
marks=pytest.mark.xfail(reason="Problem with internal outcomes")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case that starts on line 465 actually xpasses rather than xfails. If you are happy with this test as being useful then by all means keep it but remove the the xfail mark. If you think a similar test to this one can be made to xfail, add that variant.

tests/games.py Outdated
return g


def create_one_card_poker_lacking_outcome_efg() -> gbt.Game:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have based this on old code. See PR #608.

We only have "stripped_down_poker" in the test suite and don't talk about Myerson at all, but Reiley et al.

Moreover we already create two versions of this game in the test suite, see:

create_stripped_down_poker_efg(nonterm_outcomes: bool = False)

[[[0.0, 1.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0]]],
[[[0.0, 1.0], [0.0, 1.0]], [[0.0, 0.0], [1.0, 0.0]]]],
6,
marks=pytest.mark.xfail(reason="True equilibrium has tiny probabilities")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow: You say the true equilibrium has tiny probabilities but all the expected equilibria above are 0.0 or 1.0 (are they the true equilibria or not)?

Does this fail if the max_regret check is removed?

Copy link
Copy Markdown
Member

@rahulsavani rahulsavani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better, but before I carefully check all the cases can you please use everywhere in the expected profiles:

  • 1 and 0 instead of 1.0 and 0.0
  • fractions as strings instead of floating point long fractions, e.g. "1/3" instead of 0.333333 and "2/3" instead of 0.66666667.

The tests already do a floating point comparison, and we should encode the exact equilibria as the target of what we would like to find, even for enumpoly which only works in float. Note all the other tests are done using integers without quotes or fractions in quotes. Please stay consistent with that.

@rahulsavani
Copy link
Copy Markdown
Member

Actually, to save me time, even if we don't keep it, if you can add a final test that just shows how for each pairs of variants they have the same reduced strategic form, e.g, with something like:

    old = games.create_kuhn_poker_efg(nonterm_outcomes=True)
    new = games.create_kuhn_poker_efg(nonterm_outcomes=False)
    for i in [0, 1]:
        assert (old.to_arrays()[i] == new.to_arrays()[i]).all()

but using @pytest.mark.parametrize( so that we can do all your new pairs of games in a single test. We can decide after if we want to keep this in the test suite but I certainly prefer to check that the two variants are really identical (which I already did for stripped-down poker, Kuhn, and the small 2x2 bimatrix games with non-terminal outcome / missing terminal outcome variants).

@rahulsavani
Copy link
Copy Markdown
Member

I moved the check that the reduced strategic forms match to test_extensive.py. I consider these useful (consistency) tests for checking the the RSF is correctly computed when there are non-terminal/missing outcomes.

I checked all the pairs of standard/modified games. For all but one, they are as expected, copying from the new comment in test_reduced_strategy_form_nonterminal_outcomes_consistency:

    standard: game uses only non-terminal outcomes, with all non-terminal nodes having outcomes
    modified: is payoff equivalent, but with non-terminal outcomes or missing terminal outcomes

    The test checks that the corresponding reduced strategic forms match.

The one that doesn't adhere to this is "3_player_with_internal_outcomes" because it does not have all terminal outcomes in the standard variant. Please fix.

Otherwise, I am now convinced the new games are all meaningful tests.

There's a lot of non-optimal pygambit code in creating these new games in tests/games.py, and also perhaps more new games than we needed (we could have made variants of existing games instead). As such, let's just wait for input from @tturocy before merging.

@tturocy tturocy merged commit ed06f43 into master Dec 19, 2025
28 checks passed
@tturocy tturocy deleted the nash_tests branch December 19, 2025 10:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants